# High Performance Digital Pulsewidth-Control Circuit With Programmable Duty Cycle

Vandana.M<sup>1</sup>, V.Deepika<sup>2</sup>

<sup>1</sup>(Department of Electronics and Communication Engineering, Anna University, India) <sup>2</sup>(Department of Electronics and Communication Engineering, Anna University, India)

**ABSTRACT :** In high-speed operation, the duty cycle of the clock signal should be calibrated at 50%. However, variations in process, voltage and temperature (PVT) influence the duty cycle and make it difficult to hold at 50%. Pulsewidth control loops (PWCLs) are used to overcome this deviation. This work presents a high-performance, fast-locking, all-digital pulsewidth-control circuit with programmable duty cycle. The circuit uses two delay lines and a time-to-digital detector, which reduces the amount of hardware required. The output duty cycle is calculated by a new duty-cycle setting circuit without the need for a look-up table. The design is developed in a hardware description language (HDL) to reduce the design effort. The pulsewidth-control circuit is capable of operating over a wide frequency range with fewer delay cells. The reliability of the circuit is increased by using a triple modular redundancy (TMR) system. Experimental results show that the proposed approach consumes less area and power than previous methods and that the circuit is reliable.

Keywords – Duty cycle, pulsewidth-control circuit, TMR system, time-to-digital detector.

## 1. INTRODUCTION

To meet the demand for high-speed operation, many systems such as DDR SDRAM adopt double data rate (DDR) technology. In these systems, both the rising and falling edges of the clock are used to sample the input data, which requires the duty cycle of the clock to be maintained precisely at 50%. Variations in process, voltage, and temperature (PVT) may influence the duty cycle of the clock signal, making it difficult to calibrate the duty cycle precisely at 50%. Overcoming deviations from a 50% duty cycle is therefore an important problem in the further development of high-speed operation.

Pulsewidth control loops (PWCLs) can be used to overcome this problem. A conventional PWCL was built around a ring oscillator, whose duty cycle deviated widely under PVT variations. The locking time of the circuit was also an important factor. Although a low-voltage PWCL is capable of operating with a short locking time, it requires a clock with 50% duty cycle as the reference signal. In the low-jitter mutual-correlated PWCL, the limitation of a 50% duty cycle is avoided by using a single-to-differential circuit. However, the duty cycle in that scheme is fixed and cannot be adjusted.

Many systems, such as DACs and ADCs, require a reference clock with programmable duty cycle. The digital pulsewidth control loop (PWCL) was first proposed to overcome the shortcomings of the conventional PWCL. The all-digital PWCL takes advantage of scaling CMOS technologies. It has, however, two disadvantages: 1) it requires 28 reference cycles to lock, and 2) the programmable duty cycle requires a look-up table to generate the corresponding duty cycles. To overcome these disadvantages, a new all-digital pulsewidth-control circuit with programmable duty cycle was introduced. It provides a new duty-cycle setting circuit that calculates the desired output duty cycle without the need for a look-up table. However, the reliability of that circuit was limited.

This paper proposes a new high-performance digital pulsewidth-control circuit with programmable duty cycle. The reliability of the circuit is increased by combining the pulsewidth-control circuit with a triple modular redundancy (TMR) system. The major benefits of our high-performance pulsewidth-control circuit are: 1) the circuit achieves fast locking; 2) the circuit requires only 7-11 reference cycles to reach the locked state; 3) the circuit is capable of operating over a wide frequency range with fewer delay cells; 4) the amount of hardware required is reduced by the use of delay lines and a time-to-digital detector; 5) the reliability of the circuit is increased by using a TMR system. Increasing the reliability minimizes the impact of transient and permanent faults.

The remainder of this paper is organized as follows. The proposed architecture is discussed in Section 2. Section 3 presents the main building blocks and their operation. Section 4 presents the experimental analysis. Conclusions are presented in Section 5.

## 2. PROPOSED CIRCUIT ARCHITECTURE

The proposed high-performance digital pulsewidth-control circuit with programmable duty cycle is shown in Fig. 1. The overall system is decomposed into five main function blocks: 1) a coarse pulsewidth identification circuit (CPI), 2) a coarse delay line (CDL) and a coarse detector, 3) a fine delay line (FDL) and a fine detector, 4) a duty-cycle setting circuit for calculating the output duty cycle, and 5) a TMR system for increasing the reliability.



Fig. 1. (a) Proposed digital pulsewidth-control circuit. (b) TMR system

The system operates as follows. The period of the input signal is determined by the delay lines, which are then controlled by the duty-cycle setting circuit to generate the final output. In the proposed pulsewidth-control circuit, the input clock is divided by 2 to obtain a reference signal (REF in Fig. 1). The one-shot circuit is used to produce a pulse train matching the frequency of the output clock. In the first stage, the multiplexer (MUX) delivers REF to the CDL for pulsewidth detection. After the pulsewidth has been detected, the MUX feeds the output of the one-shot circuit into the matching delay line (MDL) to produce the output.

The CPI circuit determines the pulsewidth of the REF signal in order to control the 16-to-4 MUX1. MUX1 then enables four output paths. The coarse detector compares the four MUX1 outputs with REF to decide which of the MUX2 input paths to enable. The fine detector determines which of the three delay paths in the FDL is closest to REF and produces the output (Bf2, Bf1). After the pulsewidth of the input signal has been detected, the same circuit is used again to generate the final output clock. The duty-cycle setting circuit calculates the desired duty cycle from the duty-cycle code given by the programmer in conjunction with the outputs of the coarse detector and fine detector. The extra delay caused by MUX1 and MUX2 is compensated by the MDL. The output of the digital pulsewidth-control circuit with programmable duty cycle is given to a TMR system, which improves the reliability of the output.

International conference on Recent Innovations in Engineering (ICRIE'14) Sri Subramanya College of Engineering and Technology, Palani

## 3. MAIN BUILDING BLOCKS

#### 3.1 CPI Circuit

The CPI circuit is used to find the pulsewidth of REF, which is equal to the period of the input signal. The CPI circuit and an example timing diagram are shown in Fig. 2. The divided REF signal is given to both the CPI and the CDL circuit. The CPI also receives three output signals (Out4, Out8 and Out12) from the CDL. The pulses of these three output signals trigger three D flip-flops. Initially, F1, F2, F3, F4 and FC\_FINISH are set to 10000. In the example of Fig. 2(b), the input pulsewidth is assumed to lie between 8 and 12 coarse delays. The Out4 signal triggers its D flip-flop, so the pulsewidth code F1 falls low and F2 rises high. Out8 then triggers its D flip-flop, so F2 falls low and F3 rises high. When the REF signal goes low, FC\_FINISH is set high to complete the detection. Out12 therefore does not trigger the final D flip-flop, and F3 and F4 do not change state. Finally, the codes (F4 to F1) are set to 0100 and sent to the CDL and MUX1.
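The flip-flop sequence described above can be sketched as a small behavioural model. This is an illustrative Python sketch rather than the authors' HDL; the function name `cpi_code`, the unit coarse delay `tc`, and the strict comparison at tap boundaries are assumptions made for the example.

```python
def cpi_code(pulsewidth, tc=1.0):
    """Return the one-hot code (F4, F3, F2, F1) for a REF pulsewidth.

    Assumption: taps Out4, Out8 and Out12 rise 4, 8 and 12 coarse
    delays after REF rises, and a tap only clocks its flip-flop if it
    arrives while REF is still high.
    """
    taps = (4 * tc, 8 * tc, 12 * tc)
    hot = 0  # index of the bit that is high: 0 -> F1, 1 -> F2, ...
    for tap in taps:
        if tap < pulsewidth:
            hot += 1      # the high bit moves one position up
        else:
            break         # REF fell first: FC_FINISH ends detection
    f = [0, 0, 0, 0]
    f[hot] = 1
    f1, f2, f3, f4 = f
    return (f4, f3, f2, f1)


# Pulsewidth between 8 and 12 coarse delays, as in Fig. 2(b).
print(cpi_code(10.0))
```

For a pulsewidth of 10 coarse delays, the model moves the high bit from F1 to F3 and leaves F4 untouched, which matches the code 0100 in the text.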

The CPI circuit has two main functions. First, the decoder complexity is reduced by reducing the number of detectors required in the CDL. Second, when the detection of the input signal is finished, the CDL, MUX1, MUX2, and FDL are reused to generate the falling edge of the output signal. The CPI circuit turns off unused delay cells to save power. For example, when the input signal has a period less than 6Tc, coarse delay cells C7 to C16 remain unused.

#### 3.2 CDL and Coarse Detector

The CDL comprises 15 tri-state delay cells, each with a delay of Tc. Fig. 3 shows the CDL and the coarse detector. The CDL is divided into four groups (C1 to C3, C4 to C7, C8 to C11, and C12 to C15) plus one matching delay cell, C16. MUX1 selects one signal (Input, Out4, Out8, or Out12) and sends it to the coarse detector and MUX2. MUX1 also selects another signal (Out1, Out5, Out9, or Out13) and sends it to the coarse detector and MUX2 as well. The same selection applies to (Out2, Out6, Out10, and Out14) and (Out3, Out7, Out11, and Out15). If the pulsewidth of the reference signal is greater than 8Tc and smaller than 12Tc, the CPI circuit detects this, generates the codes F4 to F1 = {0100}, and directs MUX1 to enable the four output paths (Out8 to Out11) of delay cells C8 to C11. A thermometer-to-binary encoder then converts the digital code from the coarse detector and CPI circuit to the binary code (Bc4 to Bc1). If the pulsewidth falls between Out11 and Out12, the coarse detector codes A4 to A1 as {1000}, and the pulsewidth codes F4 to F1 from the CPI are {0100}. The final output binary code of the coarse detector, Bc4 to Bc1, is then {1011}, which is equal to the number of coarse delay cells closest to the REF pulsewidth.
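The way the two codes combine into Bc4 to Bc1 can be illustrated with a short sketch. It assumes, based only on the worked example above, that both F4..F1 and A4..A1 are one-hot codes, with F selecting a group of four taps and A the position inside that group; the helper names `one_hot_index` and `coarse_binary` are hypothetical.

```python
def one_hot_index(bits):
    """Index of the single high bit; bits are given MSB first."""
    if sum(bits) != 1:
        raise ValueError("expected a one-hot code")
    return len(bits) - 1 - bits.index(1)


def coarse_binary(f_code, a_code):
    """Combine CPI code (F4..F1) and coarse-detector code (A4..A1).

    Assumption: the coarse count is 4 * group + position, where the
    group comes from F and the position within the group from A.
    """
    group = one_hot_index(f_code)      # 0..3, selects Out0/4/8/12 base
    position = one_hot_index(a_code)   # 0..3, tap within the group
    count = 4 * group + position
    return count, format(count, "04b")


# Example from the text: F4..F1 = 0100 and A4..A1 = 1000.
print(coarse_binary((0, 1, 0, 0), (1, 0, 0, 0)))
```

With the codes from the text the sketch yields 11 coarse delays, that is, Bc4 to Bc1 = 1011.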






Fig. 2. (a) CPI Circuit. (b) One example of the time diagram of CPI circuit.

#### 3.3 FDL and Fine Detector

The FDL comprises three tri-state delay cells. Fig. 4 presents the FDL and fine detector. Each cell has a delay Tf, which in our design is equal to one quarter of Tc. A serial structure, rather than a parallel one, is employed to improve the time resolution of the FDL. A serial structure decreases the fan-out of the delay cells because phase detection needs to be performed only on the last delay cell instead of on each of them. The use of only three delay cells enables the FDL to decrease detection time and improve time resolution.

Initially, the values of Q4 to Q1 are set to 0001 in our example, as shown in Fig. 4. The input signal from the CDL first travels through path1. The signal has then been delayed by 3Tf compared with REF1, which is a replica of REF. The pulsewidth of REF is 3Tf smaller than the detected results of the CPI and CDL circuit. Following the detection of path1, the fine detector enables path2, and the comparison between path2 of the delay line and REF1 is repeated. If Input\_buf still lags REF1, the fine detector continues on to path3. If REF1 triggers the D flip-flop in this state, Input\_buf leads REF1. Once the detection is complete, the results are given to the duty-cycle setting circuit. In our example, Input\_buf leads REF1 when path3 is enabled. Then Q4 to Q1 = {0111}, and the final output codes Bf2 to Bf1 of the fine detector become {01}, which is equal to the number of fine delay cells closest to the pulsewidth of REF.






Fig 4: (a) FDL and Fine Detector Circuit (b) One example of timing diagram

#### 3.4 Duty-Cycle Setting Circuit

The duty-cycle setting circuit is shown in Fig. 5. The detected results of the coarse detector and fine detector are converted to a 6-bit binary code by the thermometer-to-binary encoder. The binary code is then sent to the duty-cycle setting circuit, which calculates the corresponding results based on the duty-cycle setting codes provided by the programmer. The binary codes Bc4 to Bc1 are the output of the coarse detector, and Bf2 and Bf1 are the output of the fine detector.

Using the duty-cycle setting codes {abcd}, the duty cycle of the output clock can be set to

$$
\begin{aligned}
& a \times B_{c4}B_{c3}B_{c2}B_{c1}B_{f2}.B_{f1}\ (50\%) + b \times B_{c4}B_{c3}B_{c2}B_{c1}.B_{f2}B_{f1}\ (25\%) \\
& + c \times B_{c4}B_{c3}B_{c2}.B_{c1}B_{f2}B_{f1}\ (12.5\%) + d \times B_{c4}B_{c3}.B_{c2}B_{c1}B_{f2}B_{f1}\ (6.25\%)
\end{aligned}
$$

where a, b, c, d = 0 or 1.
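As a worked example, assume that the dot in each term marks a binary point and take the 6-bit code 101101 assembled from the examples of Sections 3.2 and 3.3 (Bc4 to Bc1 = 1011, Bf2 to Bf1 = 01). Both the reading of the dot and the choice of setting codes below are illustrative assumptions, and N is a symbol introduced here for the detected period in fine delays.

$$
\begin{aligned}
N &= (101101)_2 = 45 \text{ fine delays} \\
\{abcd\} = \{1000\}: &\quad (10110.1)_2 = 22.5 = 0.5\,N \\
\{abcd\} = \{0110\}: &\quad (1011.01)_2 + (101.101)_2 = 11.25 + 5.625 = 16.875 = 0.375\,N
\end{aligned}
$$

Under this reading, each setting bit contributes a fixed binary fraction of the detected period, so the selected fractions simply add.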



Fig 6: Output clock generator

The control circuit uses the results of the duty-cycle setting codes to determine which MUX1 and MUX2 paths should be enabled. The pulse train then passes through two delay lines and resets the D flip-flop of the output clock generator to generate the falling edge. This operation is repeated to produce the final output clock. The output clock generator is shown in Fig.6. The implementation of the duty-cycle setting circuit uses shift registers to express the division of the code: one shift corresponds to (1/2), two shifts correspond to (1/4), and so on.

Bits [4:9] correspond to an integer number of fine delay cells, while bits [0:3] represent the fractional part of a fine delay cell. Bits [0] and [1] do not influence the operation of the overall circuit, so both are ignored in this calculation. As a result, only a 6-bit adder and a 7-bit adder are required, as shown in Fig. 5. The duty-cycle setting circuit then adds the codes to generate the final results using full adders controlled by the setting codes.
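The shift-and-add calculation can be sketched in software as follows. This is a minimal behavioural sketch under stated assumptions, not the circuit of Fig. 5: the 6-bit code is placed in bits [4:9] of a 10-bit fixed-point word, each enabled term is a right shift of that word, and bits [0] and [1] of every term are cleared before the addition. The name `duty_setting` and the choice to truncate before adding are assumptions.

```python
FRAC_BITS = 4  # bits [0:3] are fractional, bits [4:9] are integer


def duty_setting(bc, bf, abcd, ignored_bits=2):
    """Return the pulsewidth, in fine delays, selected by code abcd.

    bc: 4-bit coarse code (Bc4..Bc1), bf: 2-bit fine code (Bf2..Bf1),
    abcd: tuple of four 0/1 setting bits.
    """
    period = (bc << 2) | bf          # 6-bit code Bc4..Bc1 Bf2 Bf1
    word = period << FRAC_BITS       # 10-bit fixed-point word
    total = 0
    for enable, shift in zip(abcd, (1, 2, 3, 4)):
        if enable:
            term = word >> shift     # 1/2, 1/4, 1/8, 1/16 of period
            # Assumption: bits [0] and [1] are dropped before adding.
            term = (term >> ignored_bits) << ignored_bits
            total += term
    return total / (1 << FRAC_BITS)


# Period of 45 fine delays (Bc = 1011, Bf = 01), 50% setting.
print(duty_setting(0b1011, 0b01, (1, 0, 0, 0)))
```

Because the two lowest fractional bits are discarded, the 12.5% and 6.25% terms can come out slightly below the exact fraction of the period in this model; the 50% and 25% terms are unaffected for the example code.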

## 4. EXPERIMENTAL RESULTS

The results are analyzed from the synthesis report produced by Xilinx. The design is written in VHDL. The key advantage of VHDL, when used for systems design, is that it allows the behavior of the required system to be described and verified before synthesis tools translate the design into real hardware. A VHDL design is also reusable and portable: once created, a calculation block can be used in many other projects.

The duty cycle setting circuit is used to calculate the duty cycle depending upon the duty cycle setting code given by the programmer. The implementation of the duty-cycle setting circuit uses shift registers to express the division of the code.

*IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727 PP 29-36*

www.iosrjournals.org

| Logic Distribution | Previous PWCC: Used | Previous PWCC: Available | Previous PWCC: Utilization | Proposed PWCC: Used | Proposed PWCC: Available | Proposed PWCC: Utilization |
|------------------------------------------------|-----|-----|------|-----|-----|------|
| Number of occupied Slices                      | 39  | 192 | 20%  | 39  | 192 | 20%  |
| Number of Slices containing only related logic | 39  | 39  | 100% | 39  | 39  | 100% |
| Number of Slices containing unrelated logic    | 0   | 39  | 0%   | 0   | 39  | 0%   |
| Total Number 4 input LUTs                      | 52  | 384 | 13%  | 48  | 384 | 12%  |
| Number used as logic                           | 51  |     |      | 47  |     |      |
| Number used as Shift registers                 | 1   |     |      | 1   |     |      |
| Number of bonded IOBs                          | 5   | 86  | 5%   | 5   | 86  | 5%   |
| IOB Flip Flops                                 | 1   |     |      | 1   |     |      |
| Number of GCLKs                                | 1   | 4   | 25%  | 1   | 4   | 25%  |
| Number of GCLKIOBs                             | 1   | 4   | 25%  | 1   | 4   | 25%  |
| Total equivalent gate count for design         | 632 |     |      | 608 |     |      |
| Additional JTAG gate count for IOBs            | 288 |     |      | 288 |     |      |

Fig 7: Comparison of previous PWCC and proposed PWCC

The modified pulsewidth-control circuit has high immunity to noise and a short locking time. It requires only 7-11 reference cycles to lock. Fig. 7 compares the previous PWCC with the proposed high-performance digital pulsewidth-control circuit with programmable duty cycle. The total equivalent gate count falls from 632 for the previous PWCC to 608 for the proposed circuit (about 3.8%), and the number of 4-input LUTs falls from 52 to 48, thereby reducing the area and power consumption of the circuit.

The output of the proposed high-performance digital PWCC is given to a TMR system to increase the reliability of the system. The reliable PWCC is analyzed by observing the simulation results in ModelSim 6.3f, a simulator used for both ASIC and FPGA design. The output of the reliable PWCC is shown in Fig. 8. In TMR, three identical logic circuits compute the same set of specified Boolean functions. If there are no circuit failures, the outputs of the three circuits are identical. If a circuit fails, the outputs of the three circuits may differ. A majority gate then decides which output is correct: the three outputs are processed by a majority voting system to produce a single output.
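The majority voting step can be illustrated with a few lines of Python. This is a generic sketch of a two-out-of-three voter, not the authors' TMR implementation; the function names and the sample bit streams are made up for the example.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three majority of three replica outputs."""
    return (a & b) | (b & c) | (a & c)


def vote_stream(r1, r2, r3):
    """Vote sample by sample over three replica output sequences."""
    return [majority(a, b, c) for a, b, c in zip(r1, r2, r3)]


# Three copies of the same clock output; the second copy has one
# corrupted sample (index 2), standing in for a transient fault.
good = [1, 1, 0, 0, 1, 1, 0, 0]
faulty = [1, 1, 1, 0, 1, 1, 0, 0]
print(vote_stream(good, faulty, good) == good)
```

A single faulty replica is outvoted by the other two, so the voted output equals the fault-free sequence. If two replicas fail on the same sample in the same way, the voter passes the wrong value.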



Fig 8: Output of reliable PWCC

## 5. CONCLUSION

To meet the demand for high-speed operation, a high-performance all-digital pulsewidth-control circuit has been developed. The modified pulsewidth-control circuit has high immunity to noise and a short locking time. Its area and power consumption are reduced compared with the previous all-digital PWCC. It requires only 7-11 reference cycles to lock. It has a new duty-cycle setting circuit that produces output duty cycles from 31.25% to 68.75% in increments of 6.25% without the need for a look-up table. The new design is developed in a hardware description language and implemented with standard cell libraries, and is therefore easily portable between technologies. The reliability of the circuit is increased by using a TMR system. With the proposed architecture, the duty cycle of the output clock can be assured and high performance of the circuit can be achieved.

## 6. ACKNOWLEDGMENT

I, VANDANA.M, student of M.E APPLIED ELECTRONICS, Dept. of ECE, Dhanalakshmi Srinivasan College of Engineering, Coimbatore, would like to thank Asst. Prof. Ms. V.DEEPIKA for her encouragement and constant co-operation throughout the completion of this paper. I also express my deep gratitude to all the staff of the ECE department for their valuable advice and co-operation.

## REFERENCES

- [1]. J.-R. Su, T.-W. Liao, and C.-C. Hung, "All-digital fast-locking pulsewidth-control circuit with programmable duty cycle," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 21, no. 6, Jun. 2013.
- [2]. P. H. Yang and J. S. Wang, "Low-voltage pulsewidth control loops for SoC applications," *IEEE J. Solid-State Circuits*, vol. 37, no. 10, pp. 1348–1351, Oct. 2002.
- [3]. W.-M. Lin and H.-Y. Huang, "A low-jitter mutual-correlated pulsewidth control loop circuit," *IEEE J. Solid-State Circuits*, vol. 39, no. 8, pp. 1366–1369, Aug. 2004.
- [4]. Y.-J. Wang, S.-K. Kao, and S.-I. Liu, "All-digital delay-locked loop/pulsewidth-control loop with adjustable duty cycles," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, pp. 1262–1274, Jun. 2006.
- [5]. S.-R. Han and S.-I. Liu, "A single-path pulsewidth control loop with a built-in delay-locked loop," *IEEE J. Solid-State Circuits*, vol. 40, no. 5, pp. 1130–1135, May 2005.
- [6]. F. Mu and C. Svensson, "Pulsewidth control loop in high-speed CMOS clock buffers," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 134–141, Feb. 2000.